ABSTRACT

Weak gravitational lensing surveys using photometric redshifts can have their cosmological con- 
straints severely degraded by errors in the photo-z scale. We explore the cosmological degradation 
versus the size of the spectroscopic survey required to calibrate the photo-z probability distribution. 
Previous work has assumed a simple Gaussian distribution of photo-z errors; here, we describe a 
method for constraining an arbitrary parametric photo-z error model. As an example we allow the 
photo-z probability distribution to be the sum of N_g Gaussians. To limit cosmological degradation
to a fixed level, photo-z distributions composed of multiple Gaussians require up to 5 times larger
calibration samples than one would estimate from assuming a single-Gaussian model. This degradation
saturates at N_g ~ 4 in the simple case where the fiducial distribution is independent of N_g. Assuming
a single Gaussian when the photo-z distribution has multiple parameters underestimates cosmological 
parameter uncertainties by up to 35%. The size of required calibration sample also depends on the 
shape of the fiducial distribution, even when the rms photo-z error is held fixed. The required calibra- 
tion sample size varies up to a factor of 40 among the fiducial models studied, but this is reduced to a 
factor of a few if the photo-z error distributions are forced to be slowly varying with redshift. Finally, 
we show that the size of the required calibration sample can be substantially reduced by optimizing its 
redshift distribution. We hope this study will help stimulate work on better understanding of photo-z 
errors. 

Subject headings: cosmology - gravitational lensing, large-scale structure of the universe 



1. INTRODUCTION 

Explaining the Hubble acceleration, i.e. the "dark 
energy," is one of the main challenges to cosmologists. 
Weak gravitational lensing (WL) has perhaps the most 
potential to constrain dark energy parameters of any ob- 
servational window, but is a newly developed technique 
which could be badly degraded by systematic errors (Al- 
brecht et al 2005). A WL survey requires an estimate 
of the shape and the redshift of each source; dominant 
observational systematic errors are expected to be errors 
in galaxy shape due to the uncorrected influence of the 
point spread function (PSF) and errors in estimation of 
redshift distributions if they are determined by photo- 
metric redshifts (photo-z's). Interpretation of WL data 
could also be systematically incorrect due to errors in 
the theory of the non-linear matter power spectrum or 
intrinsic alignments of galaxies. In this paper we present 
a new and more general analysis of the effect of photo-z 
calibration errors and of the size of the spectroscopic sur- 
vey required to reduce photo-z errors to a desired level. 

Recent work has addressed many of these potential
systematic errors in WL data and theory: from the
computation of the nonlinear matter power spectrum
(Vale & White 2003; White & Vale 2004; Heitmann et al.
2005; Huterer & Takada 2005; Hagan, Ma & Kravtsov 2005;
Linder & White 2005; Ma 2006; Francis et al. 2007); from
baryonic cooling and pressure forces on the distribution
of large-scale structures (White 2004; Zhan & Knox 2004;
Jing et al. 2006; Rudd et al. 2008; Zentner et al. 2008);
approximations in inferring the shear from the maps
(Dodelson & Zhang 2005; White 2005; Dodelson et al. 2006;
Shapiro & Cooray 2006); and the presence of dust (Vale
et al. 2004). The promise and problems of WL have
stimulated work on how to improve the PSF reconstruction
(Jarvis & Jain 2004), estimate shear from noisy images
(Bernstein & Jarvis 2002; Hirata & Seljak 2003; Hoekstra
2004; Heymans et al. 2006; Nakajima & Bernstein 2007;
Massey et al. 2007), and protect against errors in the
theoretical power spectrum at small scales (Huterer &
White 2005).

For visible-NIR WL galaxy surveys, the dominant
systematic error is likely to be inaccuracies in the
photo-z calibration. The effect of photo-z calibration on
weak lensing is studied by Ma et al. (2006); Huterer et al.
(2006); Jain et al. (2007); Abdalla et al. (2007); and
Bridle & King (2007). The distributions of photo-z errors
assumed for these studies are, however, much simpler than
will exist in real surveys (Dahlen et al. 2007; Oyaizu
et al. 2007; Wittman et al. 2007; Stabenau et al. 2007).
Huterer et al. (2006) assumed that photo-z errors take the
form of simple shifts (a bias that varies with z), while
Ma et al. (2006) assumed that the photo-z error
distribution is a Gaussian, with a bias and dispersion
that are functions of z. These studies find that dark
energy constraints are very sensitive to the
uncertainties of photo-z parameters. A spectroscopic
calibration sample on the order of 10^5 galaxies is
required to keep the degradation of dark energy
constraints below 50%. In this work we relax the Gaussian
assumption, presenting a method to evaluate the
degradation of dark energy parameter accuracy versus the
size of the spectroscopic calibration survey, for the case
of a photo-z error distribution described by any
parameterized function. We then apply this to a model in
which the core of the photo-z error distribution is the
sum of multiple Gaussians, ignoring for now the effect of
so-called catastrophic photo-z errors or outliers.

The outline of the paper is as follows. In §2 we
introduce the formalism and parameterizations of
cosmology, galaxy redshift distributions and photometric
redshift errors. The implementation of the formalism is
detailed in §3. We show the dependence of the size of the
calibration sample on the number of Gaussians and the
shapes of the fiducial photo-z models in §4. We illustrate
the effectiveness of optimizing the calibration sample in
§5. We discuss our results and conclude in §6.

2. METHODOLOGY 

Two major generalizations are made to the work done
in Ma et al. (2006). One is that we do not assume a priori
knowledge of the true underlying (unobserved) galaxy
redshift distribution n(z). Instead, we treat it as an
unknown function which must be constrained by the photo-z
distribution n(z_ph) and other observables. The other
modification we make is to generalize the photo-z
probability distribution to generic parametric functions,
in our case multiple Gaussians.

2.1. Galaxy Redshift Distributions and Parameters 

One of the observables that a weak lensing survey
would provide is the galaxy photo-z distribution n(z_ph).
The corresponding galaxy true redshift distribution n(z)
is unknown. These two galaxy redshift distributions are
related by the photo-z probability distribution P(z_ph|z),

    n(z_{ph}) = \int n(z) P(z_{ph}|z) dz .    (1)

In practice, we model the true n(z) as a linear
interpolation between values n^i at a discrete set of
redshifts {z^i}. The n^i become free parameters in a fit
to the observables.
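
As an illustration of equation (1) with a linearly interpolated n(z), the following minimal sketch evaluates n(z_ph) by the trapezoidal rule. The function names, the single-Gaussian P(z_ph|z) with zero bias and dispersion 0.05(1 + z), the value z_0 = 0.374, and the integration grid are assumptions made only for this example; they are not the authors' implementation.

```python
import math

def interp_nz(z, nodes, values):
    """Linearly interpolate n(z) between node values; zero outside the node range."""
    if z < nodes[0] or z > nodes[-1]:
        return 0.0
    for k in range(len(nodes) - 1):
        z0, z1 = nodes[k], nodes[k + 1]
        if z0 <= z <= z1:
            t = (z - z0) / (z1 - z0)
            return (1.0 - t) * values[k] + t * values[k + 1]
    return values[-1]

def gaussian_photoz(z_ph, z, bias=0.0, sigma0=0.05):
    """Hypothetical single-Gaussian P(z_ph | z) with dispersion sigma0 * (1 + z)."""
    s = sigma0 * (1.0 + z)
    return math.exp(-0.5 * ((z_ph - z - bias) / s) ** 2) / (math.sqrt(2.0 * math.pi) * s)

def n_photoz(z_ph, nodes, values, steps=3000):
    """Equation (1) by the trapezoidal rule over the range spanned by the nodes."""
    lo, hi = nodes[0], nodes[-1]
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * interp_nz(z, nodes, values) * gaussian_photoz(z_ph, z)
    return total * h

# 31 nodes from z = 0 to 3; node values follow an assumed, unnormalized fiducial shape.
nodes = [k / 10.0 for k in range(31)]
values = [z * z * math.exp(-z / 0.374) for z in nodes]
```

In a Fisher analysis the node values would be the free parameters n^i, and derivatives of n(z_ph) with respect to them follow directly because the model is linear in the n^i.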

Weak-lensing tomography (Hu 1999; Huterer 2002)
extracts temporal information by dividing n(z_ph) into a
few photo-z bins. The true distribution of galaxies n_i(z)
that fall in the ith photo-z bin, with
z_ph^(i) < z_ph < z_ph^(i+1), becomes

    n_i(z) = \int_{z_{ph}^{(i)}}^{z_{ph}^{(i+1)}} dz_{ph} n(z) P(z_{ph}|z) .    (2)

Ma et al. (2006) had taken P(z_ph|z) to be a Gaussian,
described by two parameters (redshift bias and rms) at a
given value of z. Now we allow a generic dependence on a
set of photo-z parameters p_μ indexed by μ, P(z_ph|z; p_μ).
For a multiple Gaussian photo-z model, p_μ are the biases
and rms values of the component Gaussians.
The total number of galaxies per steradian

    n^A = \int_0^\infty dz n(z) ,    (3)

fixes the normalization, and we analogously define

    n_i^A = \int_0^\infty dz n_i(z)    (4)

for the tomographic bins.



2.2. Observables 

We utilize information from both the lensing survey and
the redshift surveys; the latter comprise the galaxy
photo-z distribution and the spectroscopic calibration
sample for the photo-z's.



2.2.1. Lensing Cross Spectra 

Following Ma et al. (2006), we choose the
number-weighted convergence power spectra
n_i^A n_j^A P_ij^κ(ℓ) as lensing observables (footnote 1),
where i and j label tomographic bins. From Kaiser
(1992, 1998) we have

    n_i^A n_j^A P_{ij}^\kappa(\ell) = \int dz W_i(z) W_j(z) \frac{H(z)}{D^2(z)} P(k_\ell, z) ,    (5)

where H(z) is the Hubble parameter, D(z) is the angular
diameter distance in comoving coordinates, P(k_ℓ, z)
is the three-dimensional matter power spectrum, and
k_ℓ = ℓ/D(z) is the wavenumber that projects onto the
multipole ℓ at redshift z. The weights W are given by

    W_i(z) = \frac{3}{2} \Omega_m \frac{H_0^2 D(z)}{H(z)} (1+z) \int_z^\infty dz' n_i(z') \frac{D_{LS}(z,z')}{D(z')} ,    (6)

where D_LS(z, z') is the angular diameter distance
between the two redshifts. We compute the power spectrum
from the transfer function of Eisenstein & Hu (1999)
with dark energy modifications from Hu (2002) and the
nonlinear fitting function of Peacock & Dodds (1996).

2.2.2. Photo-z Distribution 

Another set of observables from the redshift surveys
is the galaxy photo-z distribution, n(z_ph), collected into
bins. The width δz_ph of these bins would typically be
much finer than the tomography bins and should be at
least as fine as the spacing of the nodes z^i on which the
true redshift distribution is defined. Binning equation
(1), we have

    n(z_{ph}) \delta z_{ph} = \int n(z) P(z_{ph}|z; p_\mu) \delta z_{ph} dz .    (7)

So the observables are functions of the intrinsic
distribution {n^i} and the photo-z parameters p_μ.

2.2.3. Spectroscopic Redshifts 

The last piece of information we utilize is the
spectroscopic calibration sample. We presume that a
representative sample of N^i_spect galaxies has been drawn
from the sources in redshift bin i, with spectroscopic
redshifts determined for all of them. Equivalently, we can
demand that the failure rate for obtaining redshifts in
the spectroscopic survey must be completely independent of
redshift. The likelihood of the jth spectroscopic survey
galaxy with photo-z value z_ph being observed to have
spectroscopic redshift z^j is of course P(z_ph|z^j; p_μ).
Each spectroscopic redshift hence adds a little more
constraint to the photo-z parameters, as quantified in
§2.3.3 below. We presume all the spectroscopic z values
are independent, i.e. we ignore source clustering. While
this may be unrealistic in practice for spectroscopic
surveys over small areas of sky, it is more likely, and
adequate, that the redshift errors are uncorrelated, so
that we can constrain P(z_ph - z|z) with N^i_spect
independent samples.

1 Since we are using all the information from the galaxy number
distribution in this study, one could equally well use the
unweighted spectra P_ij^κ(ℓ) as lensing observables.

We have considered the spectroscopic sample to
constrain P(z_ph|z), which can combine with photo-z counts
n(z_ph) to constrain the true redshift distribution n(z).
One could potentially assume the spectroscopic sample
to sample and constrain n(z) directly. We avoid this for
two reasons. First, claiming both uses for the
spectroscopic sample would be "double counting" its
information. Second, a direct constraint of n(z) would
depend heavily on the assumption that the calibration
sample is a fair representation of the full photo-z
sample. Source clustering in the spectroscopic sample
would be more of an issue. In addition, we investigate
below the possibility of targeting calibration samples at
rates that vary with redshift. In this situation, the
calibration sample could deviate from the true underlying
galaxy redshift distribution by quite a bit.

It remains crucial, in any case, that the calibration
sample is a fair representation of the photo-z sample
within each redshift bin and for every galaxy type. For
example, if we are taking spectra for 5% of the photo-z
sample in some redshift bin, we must be sure to draw 5%
of the blue galaxies and 5% of the red galaxies for our
complete spectroscopic survey and succeed in obtaining
redshifts for all regardless of color.

2.3. Fisher Matrix 

The Fisher matrix quantifies the information contained
in the observables. The total Fisher matrix is the sum of
that from each of three kinds of (uncorrelated)
observables: the lensing shear, the observed photo-z
distribution, and the spectroscopic redshift distribution,

    F^{total}_{\mu\nu} = F^{lens}_{\mu\nu} + F^{n(z_{ph})}_{\mu\nu} + F^{spect}_{\mu\nu} ,    (8)

and the errors on the parameters are given by
\Delta p_\mu = [ (F^{total})^{-1} ]_{\mu\mu}^{1/2}.

2.3.1. Lensing Cross Spectra

The F^lens quantifies the information contained in the
lensing observables

    O_a(\ell) = n_i^A n_j^A P_{ij}^\kappa(\ell) ,    (a = {ij}, i \ge j)    (9)

on a set of cosmological parameters, photo-z parameters
p_μ, and the underlying galaxy redshift distribution
parameters. Under the approximation that the shear fields
are Gaussian out to ℓ_max, the Fisher matrix is given by

    F^{lens}_{\mu\nu} = \sum_{\ell=2}^{\ell_{max}} (\ell + 1/2) f_{sky} \sum_{ab} \frac{\partial O_a}{\partial p_\mu} [C^{-1}]_{ab} \frac{\partial O_b}{\partial p_\nu} .    (10)

Given shot and Gaussian sample variance, the covariance
matrix of the observables becomes

    C_{ab} = n_i^A n_j^A n_k^A n_l^A ( P^{tot}_{ik} P^{tot}_{jl} + P^{tot}_{il} P^{tot}_{jk} ) ,    (11)

where a = {ij} and b = {kl}. The total power spectrum
is given by

    P^{tot}_{ij} = P^\kappa_{ij} + \delta_{ij} \frac{\gamma_{int}^2}{n_i^A} ,    (12)

where γ_int is the rms shear error per galaxy per
component contributed by intrinsic ellipticity and
measurement error. For illustrative purposes we use
ℓ_max = 3000, f_sky corresponding to 20,000 deg^2, n^A
corresponding to 30 galaxies arcmin^-2, and γ_int = 0.22.
This is what might be expected from an ambitious
ground-based survey like the Large Synoptic Survey
Telescope (LSST) (footnote 2).

For the cosmological parameters, we consider four
parameters that affect the matter power spectrum: the
physical matter density Ω_m h^2 (= 0.14), physical baryon
density Ω_b h^2 (= 0.024), tilt n_s (= 1), and the
amplitude δ_ζ (= 5.07 × 10^-5; or A = 0.933, Spergel et al.
2003). Values in parentheses are those of the fiducial
model. To these four cosmological parameters, we add three
dark energy parameters: the dark energy density
Ω_DE (= 0.73), its equation of state today
w_0 = p_DE/ρ_DE|_{z=0} (= -1) and its derivative
w_a = -dw/da|_{z=0} (= 0), assuming a linear evolution with
the scale factor, w = w_0 + (1 - a) w_a. Unless otherwise
stated, we shall take Planck priors on these seven
parameters (W. Hu, private communication).

2.3.2. Photo-z Distribution

The F^{n(z_ph)} quantifies the information contained in
the galaxy photo-z distribution. We use the model of
equation (7) to find the dependence of each observable
n(z_ph) on the true redshift and photo-z parameters. Each
bin is presumed to have Poisson uncertainties

    \sigma[ n(z_{ph}) \delta z_{ph} ] = [ n(z_{ph}) \delta z_{ph} ]^{1/2} .    (13)

In practice, the number of photo-z's will be large, and
F^{n(z_ph)} acts like a linear constraint on the other
parameters.

2.3.3. Spectroscopic Redshifts

The F^spect quantifies the information contained in the
spectroscopic calibration sample on photo-z parameters
p_μ. The simple likelihood analysis of Appendix A shows
that the Fisher matrix from the spectroscopic survey is

    F^{spect}_{\mu\nu} = \sum_i N^i_{spect} \int dz_{ph} \frac{1}{P^i(z_{ph}|z)} \frac{\partial P^i}{\partial p_\mu} \frac{\partial P^i}{\partial p_\nu} ,    (14)

where N^i_spect spectra have been obtained from redshift
bin i (out of N_pz) and P^i describes the photo-z errors
for this bin.
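
To make the spectroscopic Fisher matrix concrete, here is a minimal numerical sketch for one redshift bin and a single-Gaussian error distribution in x = z_ph - z. It integrates (1/P)(dP/dp_μ)(dP/dp_ν) over x with finite-difference derivatives. The function names, grid, and step sizes are assumptions for illustration only. For a Gaussian the result should match the standard values N/σ^2 for the bias and 2N/σ^2 for the dispersion, which is the origin of the statement that the required N_spect grows as the square of the width.

```python
import math

def gauss(x, bias, sigma):
    """Gaussian density in x = z_ph - z with mean `bias` and dispersion `sigma`."""
    return math.exp(-0.5 * ((x - bias) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def spect_fisher(n_spect, bias, sigma, half_width=10.0, steps=4000, eps=1e-5):
    """2x2 Fisher matrix for (bias, sigma) from n_spect independent spectra."""
    lo = bias - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    fisher = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps + 1):
        x = lo + k * h
        p = gauss(x, bias, sigma)
        # Central finite differences for dP/d(bias) and dP/d(sigma).
        grad = (
            (gauss(x, bias + eps, sigma) - gauss(x, bias - eps, sigma)) / (2.0 * eps),
            (gauss(x, bias, sigma + eps) - gauss(x, bias, sigma - eps)) / (2.0 * eps),
        )
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoidal rule
        for a in range(2):
            for b in range(2):
                fisher[a][b] += weight * h * grad[a] * grad[b] / p
    return [[n_spect * value for value in row] for row in fisher]

F = spect_fisher(n_spect=1000, bias=0.0, sigma=0.1)
```

Halving σ at fixed n_spect quadruples both diagonal entries, so a narrower error distribution needs proportionally fewer spectra for the same parameter constraint.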

3. IMPLEMENTATION 

We now apply the above formalism to derive Fisher
matrices for specific cases of WL surveys and their
associated spectroscopic calibration surveys. In further
sections we vary the parameters of the photo-z errors
and the spectroscopic survey and investigate the impact
on the accuracy of dark energy parameters derived from
each survey.

Following Ma et al. (2006), the fiducial galaxy redshift
distribution n(z) is chosen to have the form

    n(z) \propto z^\alpha \exp[ -(z/z_0)^\beta ] .    (15)

Unless otherwise stated we adopt α = 2 and β = 1 and
fix z_0 such that the median redshift is z_med = 1. The
parametric model for n(z) is determined by linear
interpolation between N_pz = 31 values n^i = n(z^i) at
equally spaced redshifts between 0 and 3.
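
The value of z_0 that gives a median redshift of 1 can be found numerically. The sketch below assumes α = 2 and β = 1, for which the cumulative distribution has a closed form, and assumes the median is taken over 0 < z < infinity rather than over the truncated range 0 to 3. The function names are hypothetical.

```python
import math

def cdf_fiducial(z, z0):
    """CDF of n(z) proportional to z^2 exp(-z/z0) on 0 < z < infinity."""
    x = z / z0
    return 1.0 - math.exp(-x) * (1.0 + x + 0.5 * x * x)

def solve_z0(z_med=1.0, lo=0.05, hi=5.0, tol=1e-10):
    """Bisect on z0 until half of the galaxies lie below z_med."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # A larger z0 pushes galaxies to higher z and lowers the CDF at z_med.
        if cdf_fiducial(z_med, mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z0 = solve_z0()
```

Under these assumptions the solution is z_0 ≈ 0.374.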

2 See http://www.lssto.org

In the Gaussian case as assumed in Ma et al. (2006),
we have

    P(z_{ph}|z) = \frac{1}{\sqrt{2\pi} \sigma_z} \exp\left[ -\frac{(z_{ph} - z - z_{bias})^2}{2\sigma_z^2} \right] .    (16)

The bias z_bias and dispersion σ_z are functions of z.

In reality, P(z_ph|z) could be far more complex than a
single Gaussian. We explore this complexity by assuming
P(z_ph|z) to be the sum of Gaussians. Using N_g Gaussians
to describe P(z_ph|z), we have

    P(z_{ph}|z) = \sum_{j=1}^{N_g} \frac{C_j}{\sqrt{2\pi} \sigma_{z;j}} \exp\left[ -\frac{(z_{ph} - z - z_{bias;j})^2}{2\sigma_{z;j}^2} \right] ,    (17)

where C_j is the normalization of the jth Gaussian. Since
we assume P(z_ph|z) is normalized to unity, we have
\sum_j C_j = 1. We allow the biases z_bias;j(z) and
scatters σ_z;j(z) to be arbitrary functions of redshift.
The redshift distribution of the tomographic bins defined
by equation (2) can then be written as

    n_i(z) = \frac{1}{2} n(z) \sum_{j=1}^{N_g} C_j [ erf(x_{i+1;j}) - erf(x_{i;j}) ] ,    (18)

with

    x_{i;j} \equiv \frac{z_{ph}^{(i)} - z - z_{bias;j}}{\sqrt{2} \sigma_{z;j}} ,    (19)

where erf(x) is the error function.
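
The error-function expression for the tomographic bins can be sketched in a few lines. The example assumes the convention that each component Gaussian is centered at z + z_bias;j, and the function name and the component values are hypothetical.

```python
import math

def bin_fraction(z, lo, hi, comps):
    """Fraction of galaxies at true redshift z whose photo-z falls in [lo, hi].

    comps is a list of (C_j, z_bias_j, sigma_j) with the C_j summing to one.
    The return value is n_i(z) / n(z) for a bin with photo-z edges lo and hi.
    """
    total = 0.0
    for c, bias, sigma in comps:
        x_hi = (hi - z - bias) / (math.sqrt(2.0) * sigma)
        x_lo = (lo - z - bias) / (math.sqrt(2.0) * sigma)
        total += 0.5 * c * (math.erf(x_hi) - math.erf(x_lo))
    return total

# Hypothetical two-Gaussian model and photo-z bin edges, for illustration only.
comps = [(0.75, 0.0, 0.1), (0.25, 0.05, 0.2)]
edges = [-5.0, 0.5, 1.0, 1.5, 10.0]
fractions = [bin_fraction(1.0, edges[k], edges[k + 1], comps) for k in range(4)]
```

Because the bins tile the whole photo-z range, the fractions sum to one, which is a quick consistency check on any implementation.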

In practice, we represent the free functions z_bias;j(z)
and σ_z;j(z) by linear interpolation between values at a
discrete set of N_pz redshifts equally spaced from z = 0
to 3. The photo-z parameter set {p_μ} is hence the
2 N_g N_pz values of the biases and dispersions of the
Gaussians at these nodes.

With multiple Gaussians, we can describe a wide
variety of photo-z probability distributions P(z_ph|z).
Figure 1 shows a few examples of P(z_ph|z). A wide variety
of behaviors can be represented, including "catastrophic"
outliers. Although catastrophic photo-z errors could
potentially have a big impact on what we can get out of
cosmic shear surveys (Amara & Refregier 2007), we restrict
ourselves to studying the core of P(z_ph|z) in this study.

Ma et al. (2006) show that N_pz = 31 between z = 0
and 3 gives enough freedom to the photo-z parameters to
destroy all tomographic information. Since we are giving
the photo-z even more freedom by allowing P(z_ph|z) to
be multiple Gaussians, N_pz = 31 should be large enough.
Unless stated otherwise, we use N_pz = 31. Thus, the
total number of photo-z parameters is 62 N_g.

The observables n(z_ph), determined in bins of width
δz_ph, need not have the same bin width as the spacing of
the n(z^i) or the photo-z parameters. In fact, they should
be more finely spaced. We choose the size of δz_ph such
that further dividing it by two does not lead to any more
information gain. We find that δz_ph = 0.0125 is small
enough for all the photo-z models explored in this study.



4. SIZE OF THE SPECTROSCOPIC CALIBRATION 
SAMPLE 

In this section we investigate the size of the
spectroscopic calibration sample required to limit photo-z
systematics to some desired level. In particular, we are
interested in the increased demands that might result from
giving the photo-z distribution freedom to depart from
a single-Gaussian form. We first demonstrate that, for a
fixed fiducial photo-z model, the required calibration size
increases with the number of degrees of freedom (2 N_g)
that we allow for deviations from the fiducial model. This
increase reaches an asymptotic limit with N_g.

Second, we investigate how the required N_spect varies
as we allow the fiducial model to assume non-Gaussian
shapes. Equations (A-9) and (A-10) show that in the case
of a Gaussian distribution, the N_spect required to
constrain the photo-z parameters is proportional to the
square of the width of the distribution. In the following,
we hold the width (defined as the rms) of the fiducial
photo-z distributions to be 0.05(1 + z). Holding this
fiducial width fixed means that any variations we see are
due only to variations in the shape of the photo-z
probability distribution.
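
One way to hold the rms of a multi-Gaussian model fixed is to rescale all component biases and dispersions by a common factor. The sketch below is an assumption about how this could be done, not a description of the procedure actually used; in particular it takes "rms" to mean the root-mean-square of z_ph - z about zero, so that the square of the rms is the sum over components of C_j (σ_j^2 + z_bias;j^2). Function names and component values are hypothetical.

```python
import math

def mixture_rms(comps):
    """Root-mean-square of z_ph - z for components (C_j, bias_j, sigma_j)."""
    return math.sqrt(sum(c * (sigma * sigma + bias * bias) for c, bias, sigma in comps))

def rescale_to_rms(comps, target):
    """Scale every bias and dispersion by one factor so the mixture rms equals target."""
    factor = target / mixture_rms(comps)
    return [(c, bias * factor, sigma * factor) for c, bias, sigma in comps]

# Hypothetical two-Gaussian fiducial model at z = 1, rescaled to rms = 0.05 (1 + z).
raw = [(0.75, 0.02, 0.04), (0.25, -0.06, 0.09)]
scaled = rescale_to_rms(raw, 0.05 * (1.0 + 1.0))
```

A common rescaling preserves the shape of the distribution, so models built this way differ from the single Gaussian only in shape, not in overall width.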

We use the error degradations in w_a (that is, errors
in w_a relative to the error with perfect knowledge of the
photo-z parameters) as the measure of dark energy
degradations. The error degradations in w_p (footnote 3)
are about 30-50% lower and follow the same trend as that
of w_a. Roughly speaking, the figure of merit adopted by
the Dark Energy Task Force (Albrecht et al. 2006) will
degrade as the square of the dark energy degradation used
here.
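
The degradation measure can be illustrated with a toy two-parameter Fisher matrix: one dark energy parameter and one photo-z nuisance parameter, with the spectroscopic sample acting as a prior on the nuisance parameter. All numbers and function names below are hypothetical and chosen only to show the mechanics.

```python
import math

def marginalized_error(fisher, index=0):
    """Square root of (F^-1)_{index,index} for a symmetric 2x2 Fisher matrix."""
    (a, c), (_, b) = fisher
    det = a * b - c * c
    inverse_diagonal = (b / det, a / det)
    return math.sqrt(inverse_diagonal[index])

def degradation(fisher_survey, prior_on_nuisance):
    """Error on parameter 0 relative to its error with the nuisance parameter fixed."""
    fisher = [
        [fisher_survey[0][0], fisher_survey[0][1]],
        [fisher_survey[1][0], fisher_survey[1][1] + prior_on_nuisance],
    ]
    error_perfect = 1.0 / math.sqrt(fisher_survey[0][0])
    return marginalized_error(fisher) / error_perfect

toy = [[4.0, 3.0], [3.0, 3.0]]            # hypothetical survey Fisher matrix
no_calibration = degradation(toy, 0.0)
some_calibration = degradation(toy, 9.0)
perfect_calibration = degradation(toy, 1.0e12)
```

In this toy case the degradation falls from 2 with no calibration toward 1 as the prior tightens, mirroring the way the curves of degradation versus N_spect approach unity.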

In this section we assume that the N_spect total
spectroscopic galaxies are selected uniformly in redshift
between 0 and 3.

4.1. Dependence on the Number of Gaussians N g 

The left panel of Figure 2 plots the dark energy
degradation versus the size of the spectroscopic
calibration sample, when the photo-z error distribution
has N_g = 1, 2, 3, and 4. The fiducial biases and
dispersions are the same for all component Gaussians. So
the fiducial P(z_ph|z) is identical in all cases, but with
higher N_g, there is more freedom for deviations from the
fiducial. The second, third, and fourth Gaussian components
are each fixed to have one-fourth the total normalization
of the distribution.

At fixed dark energy degradation, the required size of
the calibration sample (N_spect) increases with the number
of Gaussians and reaches an asymptotic value when
N_g ≈ 4. When dark energy degradation is 1.5, the
N_g = 4 photo-z model requires ≈ 5 times the calibration
sample of the N_g = 1 model.

Another view is that the dark-energy uncertainties will
be underestimated if one fits a single-Gaussian model
to photo-z distributions that actually require more
freedom. For example, assume we obtain 4 × 10^4 spectra,
as required to keep dark energy degradation under 1.5
for a single-Gaussian photo-z model. We find, however,
that the dark energy degradation for N_g = 4 rises above
2.0. So relaxing the Gaussian assumption for photo-z's
inflates the cosmological uncertainties by ≈ 35%.

3 We have w_p = w(z = z_p), where z_p is the redshift at which
the errors of w_0 and w_a are decorrelated.

Fig. 1. Examples of photo-z probability distribution P(z_ph|z). From left to right, they are two Gaussians with different biases, two
Gaussians with different σ_z values, three Gaussians with parameters randomly generated, and three Gaussians with one being catastrophic.
The thick solid lines are the total P(z_ph|z), and the thin dotted lines are the individual Gaussians that build up P(z_ph|z).




[Figure 2: two panels. Horizontal axis of each panel: spectroscopic calibration sample N_spect, from 10^2 to 10^7.]



Fig. 2. Left: The N_spect requirement for the same fiducial photo-z distribution being modeled using different numbers of Gaussians.
These Gaussians differ only by their normalizations, whose ratios are shown in the legend (e.g., "2G 3:1" means a two-Gaussian model with
a normalization ratio of 3 to 1). All Gaussian components have fiducial z_bias = 0 and σ_z = 0.05(1 + z), so all cases have the same fiducial
distribution while the fitted photo-z error distribution gains more freedom. The solid line is for the case of a single Gaussian (N_g = 1).
Survey specs are LSST-like. Right: Thin lines are the same as those in the left panel but with σ_z = 0.03(1 + z). For comparison, the
four-Gaussian model in the left panel is plotted as the thick dotted line (magenta). The thick solid line (red) is the single-Gaussian model
with SNAP-like survey specs (f_sky = 4000 deg^2, n^A = 100 galaxies arcmin^-2, and γ_int = 0.22).



We also note from the left panel of Figure 2 that the
dark energy degradation has a characteristic dependence
on N_spect: for N_spect > 10^3, the dark energy parameter
error scales roughly as a power law in N_spect. When the
dark energy degradation reaches ≈ 1.2, at
N_spect = 10^5-10^6, the gains from additional spectra
become weaker and a degradation of unity is approached
only very slowly. As we vary N_g, we change the location
of this "knee" in the curve, but not the scaling for
N_spect below the knee. This scaling is not sensitive to
either the fiducial photo-z models or survey specs. For
example, as shown in the right panel of Figure 2, the same
power-law scaling holds for a photo-z model with
σ_z = 0.03(1 + z), and also for a SNAP-like survey
(footnote 4) with f_sky = 4000 deg^2, n^A = 100 galaxies
arcmin^-2, and γ_int = 0.22.

4 See http://snap.lbl.gov



The desired spectroscopic survey size N_spect will in
general depend on the width and shape of the fiducial
photo-z distribution, not just N_g. We next investigate
the dependence on the detailed shape of the fiducial
distribution.

4.2. Dependence on the Fiducial Photo-z Models 

The left panel of Figure 3 shows dark energy
degradation versus N_spect for several N_g = 2 models, all
having fiducial rms width 0.05(1 + z), but with different
fiducial biases and dispersions for the two components. In
detail, our study includes fiducial photo-z distributions
in which: the component Gaussians have the same biases
but different σ_z values ("2G σ_z diff" model); the same
σ_z values but different biases ("2G zbias diff"); the same
biases and σ_z values but with normalizations at a 3 to
1 ratio ("2G 3:1"); and 10 models in which the fiducial
z_bias;j and σ_z;j are randomly assigned while maintaining
fixed rms photo-z error ("2G seed xxx" models).

The N_spect requirements span a rather large range. For
example, at 50% dark energy degradation, most of the
photo-z models' N_spect requirement is within a factor of
4 of that of the single-Gaussian model. But some of the
models require 40 times more N_spect. Three- and
four-Gaussian photo-z models exhibit similar behaviors.

To understand the wide range of N_spect requirements
for different photo-z models, we perform the following
test. We fix the underlying galaxy redshift distribution
n(z) and do not use any information from n(z_ph). The
resulting N_spect requirements for the double-Gaussian
photo-z models are shown in the middle panel of Figure 3.
At fixed dark energy degradation, the range of N_spect
requirements is greatly reduced. For example, at 50% dark
energy degradation, the N_spect requirement is within a
factor of 2 of that of the single-Gaussian model. We
find a similar reduction of the range of N_spect
requirements in the case of three- and four-Gaussian
models. The test shows that the reason for the wide range
of N_spect requirements for different photo-z models is
that n(z_ph) constrains the underlying galaxy redshift
distribution and the photo-z parameters much better in
some of the photo-z models than others. It is the redshift
knowledge, rather than weak-lensing information itself,
that is sensitive to the details of the photo-z
probability distribution.

One possible cause of the poor sensitivity in some
photo-z models is the rapid variation of photo-z
parameters in redshift. The right panel of Figure 3 shows
the result of reducing the degree of rapid variation of
the photo-z parameters. The range of N_spect is reduced to
within a factor of 4 of that of the single-Gaussian model.
In detail, we demand that the fiducial photo-z parameters
z_bias and σ_z be proportional to 1 + z within each of the
three redshift intervals of width δz = 1. The
proportionality constants are generated randomly. These
photo-z models are much smoother than those randomly
generated in the left panel of Figure 3. This test shows
that n(z_ph) is less effective in constraining the
underlying galaxy redshift distribution and photo-z
parameters when the photo-z model is rapidly varying. In
reality, photo-z parameters would most likely show smooth
variations in redshift. The required calibration sample is
expected to be within a factor of a few times that of the
single-Gaussian fiducial model.

We point out that multi-Gaussian cases may require
fewer spectroscopic calibration galaxies than the
single-Gaussian case. As an example, examine the photo-z
model with double Gaussians whose σ_z values are
different. Its N_spect requirement is shown in Figure 3
(left) using the dotted blue line. Since we keep the width
of P(z_ph|z) fixed, one of the Gaussians in the
double-Gaussian photo-z model is narrower than the width
of P(z_ph|z) and the other Gaussian is broader. The
narrower Gaussian tends to reduce the N_spect requirement,
while the broader one tends to do exactly the opposite.
The outcome of these competing effects could be either a
smaller or larger requirement of the calibration sample.
For this particular photo-z model, the required N_spect
crosses that of the single-Gaussian model (shown as the
thick solid red curve in Fig. 3, left).

We note that the generic power-law dependence of the
w_a uncertainty on N_spect continues to hold for all the
fiducial distributions, until the dark energy degradation
drops to 1.2-1.3. This inflection typically occurs with a
few times 10^5 spectra, for the LSST survey parameters
assumed here.

5. OPTIMIZING THE SPECTROSCOPIC CALIBRATION 
SAMPLE 

So far we have been assuming that the calibration
sample is uniformly distributed in redshift. Weak lensing
may require more precise photo-z calibration at some
redshifts than others. It could be beneficial if we
distribute the calibration sample according to lensing
sensitivity. Our goal is to find the N^i_spect that lead to
the best dark energy constraints for a fixed spectroscopic
observing time T_obs. This could be modeled as

    (Uncertainties in dark energy parameters) = function(N^i_{spect}, i = 1, 2, ...) ,    (20)

    \sum_i N^i_{spect} cost(z^i) = T_{obs} ,    (21)

where cost(z^i) is the time it takes to obtain the spectrum
of a galaxy at redshift z^i. This is a constrained
nonlinear optimization problem. To calculate the function
in equation (20), we first calculate the Fisher matrices
F^lens and F^{n(z_ph)} for the presumed survey. Then for
each trial set of N^i_spect, we calculate F^spect using
equation (14), sum the Fisher matrices, and forecast the
dark energy uncertainties. As for the constraint equation
(21), we need to know the cost function. For illustrative
purposes, we assume cost(z^i) is a constant.
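
One common way to handle an equality constraint of this kind inside an unconstrained search such as a downhill simplex is to optimize over free real weights and map them onto allocations that satisfy the constraint by construction. The helper below is a sketch of that idea only; the function name, the exponential mapping, and the example numbers are assumptions, and the objective (the forecast dark energy uncertainty) is not reproduced here.

```python
import math

def allocate(weights, costs, t_obs):
    """Map unconstrained weights to non-negative N_spect^i whose total cost is t_obs."""
    shares = [math.exp(w) for w in weights]          # positive for any real weight
    norm = sum(s * c for s, c in zip(shares, costs))
    return [t_obs * s / norm for s in shares]

# Constant cost per spectrum: equal weights reproduce the uniform allocation.
uniform = allocate([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], 37500.0)

# Hypothetical case where spectra at higher redshift take longer to obtain.
tilted = allocate([0.3, -1.0, 2.0], [1.0, 2.0, 4.0], 100.0)
total_cost = sum(n * c for n, c in zip(tilted, [1.0, 2.0, 4.0]))
```

An optimizer then varies the weights freely, and every trial point automatically uses exactly the available observing time.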

As an example we choose a calibration sample of 37,500
galaxies and assume a single-Gaussian photo-z model. If
this calibration sample is uniformly distributed in
redshift, dark energy degradation is 56%. If instead we use
a downhill simplex method to find the spectroscopic
redshift sampling distribution that minimizes the dark
energy uncertainties for a fixed total number of redshifts,
we obtain the distribution shown as the histogram in
Figure 4. The optimized redshift sampling lowers the
dark energy degradation to 38%. That is a reduction of
18 percentage points in degradation at fixed investment of
spectroscopy time. From a different perspective, to reach
38% dark energy degradation with a uniformly distributed
calibration sample, 69,000 galaxy spectra are required. So
optimization saves 46% of the spectroscopic observing time
for fixed cosmological degradation. Multi-Gaussian photo-z
models exhibit very similar behaviors.
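
The quoted saving follows from simple arithmetic, sketched below. The conversion of a degradation of 56% or 38% into an error ratio of 1.56 or 1.38 relative to perfect photo-z knowledge is an assumption about how the percentages are defined.

```python
n_optimized = 37_500    # spectra used with the optimized redshift sampling
n_uniform = 69_000      # spectra needed with uniform sampling for the same degradation

# Fraction of spectroscopic time saved, assuming a constant cost per spectrum.
saving = 1.0 - n_optimized / n_uniform

# Ratio of dark energy errors, optimized to uniform, at 37,500 spectra.
error_ratio = 1.38 / 1.56

print(f"time saved: {saving:.1%}, error ratio: {error_ratio:.3f}")
```

The saving evaluates to about 46%, and the dark energy error itself shrinks by roughly 12% when the same 37,500 spectra are redistributed.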

We do not know exactly why the optimized calibration
sample distribution is not very smooth. It would
be rather difficult to plan the observation to match this
distribution. Fortunately, a smooth distribution like the
one shown using the blue dashed line in Figure 4 produces
44% dark energy degradation, which is a moderate
improvement over the uniform case.

6. CONCLUSION AND DISCUSSION 

We explore the dependence of cosmological parameter
uncertainties in WL power-spectrum tomography on
the size of the spectroscopic sample for the calibration of
photometric redshifts. We present a formula that is valid
for arbitrary parameterizations of the photo-z error
distribution and then apply this to a multi-Gaussian model
to see whether the simple Gaussian photo-z errors assumed
in previous work yield accurate results.

Fig. 3. The $N_{\rm spect}$ requirement for different fiducial
photo-z models with $N_g = 2$, including some with randomly
generated fiducial photo-z error distributions. Details of the
photo-z models are in § 4.2. Left: Information from the galaxy
photo-z distribution $n(z_{\rm ph})$ and the spectroscopic
calibration sample are used to constrain the underlying galaxy
redshift distribution $n(z)$ and photo-z parameters. Middle:
Same as the left panel, except that information from
$n(z_{\rm ph})$ is not utilized and $n(z)$ is assumed to be known
a priori. Right: Within each of the three $\delta z = 1$ intervals,
the randomly generated fiducial $z_{\rm bias}$ and $\sigma_z$ values
increase linearly with $1 + z$. The proportionalities are
generated randomly. In all three panels, the thick solid red line
is for the case of a single Gaussian ($N_g = 1$).

Fig. 4. Histogram: Optimal $N_{\rm spect}$ distribution in redshift
for the single-Gaussian model. Dark energy degradation is 56% if
this sample is distributed uniformly in redshift. The
$N_{\rm spect}$ distribution in this figure lowers dark energy
degradation to 38%, which would require 69,000 galaxy spectra to
calibrate if the distribution is flat in redshift. Blue dashed
line: Smooth fit to the histogram. The dark energy degradation is
44% if this calibration sample is used. For both the histogram and
the smooth fit, the calibration sample has 37,500 galaxies.

Indeed, we find that the required $N_{\rm spect}$ under the simple
Gaussian model is increased $\approx 4$ times when we allow
more freedom in the shape of the core of the photo-z
distribution. Fortunately, there appears to be an asymptotic
upper limit as we add more photo-z degrees of freedom.

We also find a generic behavior
$d\log\sigma/d\log N_{\rm spect} = 0.20$-$0.25$, where $\sigma$ is
the uncertainty in a dark energy parameter, in the regime where
$\sigma$ is degraded 1.2-5 times compared to the case of perfect
knowledge of the photo-z distribution. Hence, the fourfold
increase in required $N_{\rm spect}$ from relaxing the Gaussian
assumption is equivalent to a $\approx 1.3$ times degradation in
$\sigma$ at fixed $N_{\rm spect}$.

The exact value of dark energy degradation versus
$N_{\rm spect}$ depends significantly on the shape of the fiducial
distribution, even when the total rms photo-z error
is held fixed. For the case of the LSST survey with rms
photo-z error $0.05(1 + z)$, we find that the "knee" at a
dark energy degradation of 1.2-1.3 occurs in the range
$N_{\rm spect} \approx 10^5$-$10^6$.

For photo-z models described by nondegenerate Gaussians,
the size of the calibration sample varies by as much
as 40 times among the 14 models studied. Most of the
variation is caused by the different ability of the galaxy
photo-z distribution $n(z_{\rm ph})$ to constrain the underlying
galaxy redshift distribution and the photo-z probability
distribution. The photo-z models whose parameters
vary rapidly in redshift are the ones that are least
constrained. In reality, photo-z parameters are expected to
be smoothly varying in redshift. The $N_{\rm spect}$ requirement
would then be within a factor of a few of that of the
single-Gaussian fiducial distribution.

Finally, we show that the size of the calibration sample
can be effectively reduced by optimizing its redshift
distribution. In a simple example, an optimized calibration
sample of 37,500 redshifts was able to reach the same dark
energy degradation as a sample of 69,000 galaxies uniformly
distributed in redshift.

We restrict this study to the effect of the core of the
photo-z distributions. Catastrophic photo-z errors could
potentially be very damaging. The methodology provided
in this study can also be applied to the effect of
catastrophic photo-z errors. We leave this to future work.

The methodology we use assumes that the spectroscopic
survey is a fair sample of the photo-z error distribution
and is the only information available on the
photo-z error distribution. Since we have used a Fisher
matrix technique, no photo-z estimation method, regardless
of technique (neural net, template fitting, etc.), can
surpass our forecasts under these conditions.

The calibration's success depends crucially on the
spectroscopic redshifts being drawn without bias from the
redshift distribution of the photometric sample they
represent. The survey strategy must be carefully formulated
to make sure that this occurs. Differential incompleteness
between, say, red and blue galaxies, or redshift
"deserts", must be avoided. This has not been achieved
by any large redshift survey beyond $z \approx 0.5$ to date.

It may be possible to constrain $P(z_{\rm ph}|z)$ by other
means in the absence of a fair spectroscopic sample of the
size we specify. One could invoke astrophysical assumptions,
namely, that the spectra of faint galaxies are identical
to those of brighter galaxies, in an attempt to bootstrap
a fair bright sample into a calibration for fainter
galaxies. Another suggestion (Schneider et al. 2006; J.
Newman, private communication) is that the photometric
sample be cross-correlated with an incomplete spectroscopic
sample to infer the redshift distribution of the
former. It remains to be seen, however, whether these
techniques can attain the accuracy needed to supplant
a direct fair sample of $> 10^5$ spectra. This would require
some a priori bounds on the evolution of galaxy
spectra and the clustering correlation coefficients of
different classes of galaxies. We look forward to future
progress in these techniques, keeping in mind that the
demands for precision cosmology from WL tomography
are much more severe than the demands that galaxy evolution
studies typically place on photometric redshift systems.
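
The scaling $d\log\sigma/d\log N_{\rm spect} = 0.20$-$0.25$ quoted in this section can be turned into numbers with a short calculation. The sketch below is illustrative only. It assumes a pure power law, $\sigma \propto N_{\rm spect}^{-s}$, with the quoted slope read as a magnitude $s$, and it assumes that a "degradation of 56%" means $\sigma$ is 1.56 times its value for perfect photo-z knowledge. The function name `sigma_ratio` is ours.

```python
import math


def sigma_ratio(n_ratio, slope):
    """Growth factor of sigma when N_spect is reduced by the factor n_ratio,
    assuming sigma is proportional to N_spect**(-slope)."""
    return n_ratio ** slope


# Fourfold change in N_spect at the two ends of the quoted slope range.
ratio_low = sigma_ratio(4.0, 0.20)
ratio_high = sigma_ratio(4.0, 0.25)

# Slope implied by the uniform-sample example of Section 5: degradation 1.56
# with 37,500 spectra versus 1.38 with 69,000 spectra.
implied_slope = math.log(1.56 / 1.38) / math.log(69000 / 37500)

print(round(ratio_low, 2), round(ratio_high, 2), round(implied_slope, 2))
# prints: 1.32 1.41 0.2
```

Under these assumptions a fourfold change in $N_{\rm spect}$ corresponds to a factor of roughly 1.3-1.4 in $\sigma$, and the two uniform-sample numbers of Section 5 imply a slope near the lower end of the quoted range.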
APPENDIX: DERIVATION OF EQUATION 14

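If one draws $N$ events from a sample with probability
distribution function $P(x;\theta)$, where the components of
$\theta$ are the parameters specifying the distribution and $x$ is
the variable whose probability distribution is under
consideration, what are the constraints on the parameters
$\theta$?

Let us first divide $x$ into small bins and label the
width of the bins as $\Delta x_i$. The number of events that
fall in the $i$th bin, $\hat{N}_i$, is Poisson distributed with mean
$N_i = N P(x_i;\theta)\,\Delta x_i$. The likelihood function can be
expressed as

$$ L \propto \prod_i \frac{\exp(-N_i)\, N_i^{\hat{N}_i}}{\hat{N}_i!}, \qquad \mbox{(A-1)} $$

and the negative of its natural logarithm is

$$ \mathcal{L} \equiv -\ln L = \sum_i \left[ N_i - \hat{N}_i \ln N_i + \ln \hat{N}_i! \right] + {\rm const}. \qquad \mbox{(A-2)} $$

The derivatives of $\mathcal{L}$ with respect to the model
parameters $\theta$ are

$$ \frac{\partial \mathcal{L}}{\partial \theta_\mu} = \sum_i \left( 1 - \frac{\hat{N}_i}{N_i} \right) \frac{\partial N_i}{\partial \theta_\mu}, \qquad \mbox{(A-3)} $$

$$ \frac{\partial^2 \mathcal{L}}{\partial \theta_\mu \partial \theta_\nu} = \sum_i \left[ \frac{\hat{N}_i}{N_i^2} \frac{\partial N_i}{\partial \theta_\mu} \frac{\partial N_i}{\partial \theta_\nu} + \left( 1 - \frac{\hat{N}_i}{N_i} \right) \frac{\partial^2 N_i}{\partial \theta_\mu \partial \theta_\nu} \right]. \qquad \mbox{(A-4)} $$

Taking the expectation value, with $\langle \hat{N}_i \rangle = N_i$,
the Fisher matrix is

$$ F_{\mu\nu} = \left\langle \frac{\partial^2 \mathcal{L}}{\partial \theta_\mu \partial \theta_\nu} \right\rangle = \sum_i \frac{1}{N_i} \frac{\partial N_i}{\partial \theta_\mu} \frac{\partial N_i}{\partial \theta_\nu} = \sum_i \frac{N \Delta x_i}{P(x_i;\theta)} \frac{\partial P(x_i;\theta)}{\partial \theta_\mu} \frac{\partial P(x_i;\theta)}{\partial \theta_\nu} = N \int dx\, \frac{1}{P(x;\theta)} \frac{\partial P(x;\theta)}{\partial \theta_\mu} \frac{\partial P(x;\theta)}{\partial \theta_\nu}. \qquad \mbox{(A-5)} $$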
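
Equation (A-5) can be evaluated numerically for any parametric distribution. The sketch below is a minimal illustration, not code from this work. It applies central finite differences and a midpoint-rule integral to an equal-weight sum of two well-separated Gaussians. The names `mixture_pdf` and `fisher_matrix`, the equal weights, the parameter values, and the grid settings are all assumptions made for the example.

```python
import math


def gaussian(x, mu, sigma):
    """Normalized Gaussian density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def mixture_pdf(x, theta):
    """Equal-weight sum of two Gaussians; theta = (mu1, sigma1, mu2, sigma2)."""
    mu1, s1, mu2, s2 = theta
    return 0.5 * gaussian(x, mu1, s1) + 0.5 * gaussian(x, mu2, s2)


def fisher_matrix(pdf, theta, n_events, x_min, x_max, n_grid=4000, eps=1e-4):
    """F_mu_nu = N * integral dx (1/P) dP/dtheta_mu dP/dtheta_nu  (eq. A-5).

    Derivatives use central finite differences with step eps; the integral
    uses the midpoint rule on n_grid cells between x_min and x_max.
    """
    k = len(theta)
    dx = (x_max - x_min) / n_grid
    fisher = [[0.0] * k for _ in range(k)]
    for j in range(n_grid):
        x = x_min + (j + 0.5) * dx
        p = pdf(x, theta)
        if p <= 0.0:  # skip points where the density underflows to zero
            continue
        grad = []
        for a in range(k):
            up = list(theta)
            dn = list(theta)
            up[a] += eps
            dn[a] -= eps
            grad.append((pdf(x, up) - pdf(x, dn)) / (2.0 * eps))
        for a in range(k):
            for b in range(k):
                fisher[a][b] += n_events * grad[a] * grad[b] / p * dx
    return fisher


theta0 = (-5.0, 1.0, 5.0, 1.0)
F = fisher_matrix(mixture_pdf, theta0, n_events=1000, x_min=-12.0, x_max=12.0)
for row in F:
    print([round(v, 2) for v in row])
```

Because the two components barely overlap here, each one is constrained almost as if it were an isolated Gaussian sampled by half of the events. Components that overlap more strongly would produce nonzero off-diagonal terms.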



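In the special case where $P(x;\theta)$ is a Gaussian with
mean $\mu$ and spread $\sigma$,

$$ P(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right], \qquad \mbox{(A-6)} $$

we have

$$ \frac{\partial P}{\partial \mu} = \frac{x-\mu}{\sigma^2}\, P \quad \mbox{and} \qquad \mbox{(A-7)} $$

$$ \frac{\partial P}{\partial \sigma} = -\frac{P}{\sigma} + \frac{(x-\mu)^2}{\sigma^3}\, P. \qquad \mbox{(A-8)} $$

Plugging these results into equation (A-5) gives us

$$ F_{\mu\mu} = N \int_{-\infty}^{\infty} dx\, \frac{(x-\mu)^2}{\sigma^4}\, P = \frac{N}{\sigma^2} \quad \mbox{and} \qquad \mbox{(A-9)} $$

$$ F_{\sigma\sigma} = N \int_{-\infty}^{\infty} dx\, P \left[ \frac{(x-\mu)^2}{\sigma^3} - \frac{1}{\sigma} \right]^2 = \frac{2N}{\sigma^2}. \qquad \mbox{(A-10)} $$

Note that $F_{\mu\sigma} = 0$ since the integral only involves odd
powers of $x - \mu$.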
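
For completeness, the Gaussian integrals above reduce to moments of a unit normal variable. The substitution $u = (x-\mu)/\sigma$ is our notation for this worked step. It uses $\langle u \rangle = \langle u^3 \rangle = 0$, $\langle u^2 \rangle = 1$, and $\langle u^4 \rangle = 3$:

```latex
\begin{align}
F_{\mu\mu} &= \frac{N}{\sigma^2}\,\langle u^2 \rangle = \frac{N}{\sigma^2}, \\
F_{\sigma\sigma} &= \frac{N}{\sigma^2}\,\langle (u^2 - 1)^2 \rangle
  = \frac{N}{\sigma^2}\left( \langle u^4 \rangle - 2\langle u^2 \rangle + 1 \right)
  = \frac{2N}{\sigma^2}, \\
F_{\mu\sigma} &= \frac{N}{\sigma^2}\,\langle u\,(u^2 - 1) \rangle = 0 .
\end{align}
```

Inverting this diagonal matrix gives the familiar uncertainties $\sigma/\sqrt{N}$ on the mean and $\sigma/\sqrt{2N}$ on the spread.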